Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid

Author

  • Ron Kohavi
Abstract

Naive-Bayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accuracy of Naive-Bayes does not scale up as well as decision trees. We then propose a new algorithm, NBTree, which induces a hybrid of decision-tree classifiers and Naive-Bayes classifiers: the decision-tree nodes contain univariate splits as regular decision-trees, but the leaves contain Naive-Bayesian classifiers. The approach retains the interpretability of Naive-Bayes and decision trees, while resulting in classifiers that frequently outperform both constituents, especially in the larger databases tested.
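
The hybrid structure described in the abstract (univariate splits in the internal nodes, Naive-Bayes classifiers in the leaves) can be illustrated with a minimal Python sketch. This is not the NBTree algorithm itself: the abstract does not say how splits are chosen, so the sketch uses a single split on an attribute picked by the caller. The class names NaiveBayes and HybridNode, the Laplace smoothing, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


class NaiveBayes:
    """Categorical Naive-Bayes with Laplace smoothing (illustrative only)."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        # value_counts[(attr_index, label)][value] -> number of occurrences
        self.value_counts = defaultdict(Counter)
        # distinct values seen per attribute, used as the smoothing denominator
        self.values = [set() for _ in rows[0]]
        for row, label in zip(rows, labels):
            for i, v in enumerate(row):
                self.value_counts[(i, label)][v] += 1
                self.values[i].add(v)
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # log prior plus the sum of log conditional probabilities
            score = math.log(count / total)
            for i, v in enumerate(row):
                num = self.value_counts[(i, label)][v] + 1
                den = count + len(self.values[i])
                score += math.log(num / den)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class HybridNode:
    """One univariate split at the root, one Naive-Bayes model per branch."""

    def __init__(self, split_attr):
        # Assumption: the split attribute is supplied by the caller,
        # not selected by any criterion from the paper.
        self.split_attr = split_attr

    def fit(self, rows, labels):
        groups = defaultdict(lambda: ([], []))
        for row, label in zip(rows, labels):
            g_rows, g_labels = groups[row[self.split_attr]]
            g_rows.append(row)
            g_labels.append(label)
        self.leaves = {v: NaiveBayes().fit(r, l) for v, (r, l) in groups.items()}
        # Fallback model for attribute values never seen during training
        self.fallback = NaiveBayes().fit(rows, labels)
        return self

    def predict(self, row):
        leaf = self.leaves.get(row[self.split_attr], self.fallback)
        return leaf.predict(row)


def accuracy(model, rows, labels):
    hits = sum(model.predict(r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


# Toy data: the label is the XOR of two binary attributes, which violates
# the conditional independence assumption as strongly as possible.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
labels = [a ^ b for a, b in rows]

nb_acc = accuracy(NaiveBayes().fit(rows, labels), rows, labels)
model = HybridNode(split_attr=0).fit(rows, labels)
hybrid_acc = accuracy(model, rows, labels)
```

On this XOR-style toy data a single Naive-Bayes model cannot separate the classes, because each attribute alone carries no information about the label, whereas splitting on one attribute first leaves each leaf with a problem that Naive-Bayes handles. The example only shows why the combination can help; it says nothing about the accuracy figures reported in the paper.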

Similar resources

Int Reduction

Naive-Bayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accuracy of Naive-Bayes does not scale up as well as decision trees. We then propose a new algorithm, N...

Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid

Naive-Bayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accuracy of Naive-Bayes does not scale up as well as decision trees. We then propose a new algorithm, NBTree ...

Attribute Normalization Techniques and Performance of Intrusion Classifiers: A Comparative Analysis

Network traffic has several attributes with different ranges of values. These attributes can be qualitative or quantitative in nature. Attributes with large values significantly influence the performance of an intrusion classifier, making it biased towards them. Attribute normalization eliminates such dominance by scaling the values of all the attributes within a specific range. The...

Fast Perceptron Decision Tree Learning from Evolving Data Streams

Mining of data streams must balance three evaluation dimensions: accuracy, time and memory. Excellent accuracy on data streams has been obtained with Naive Bayes Hoeffding Trees—Hoeffding Trees with naive Bayes models at the leaf nodes—albeit with increased runtime compared to standard Hoeffding Trees. In this paper, we show that runtime can be reduced by replacing naive Bayes with perceptron c...

Scaling Up the Accuracy of Decision-Tree Classifiers: A Naive-Bayes Combination

C4.5 and NB are two of the top 10 algorithms in data mining thanks to their simplicity, effectiveness, and efficiency. In order to integrate their advantages, NBTree builds a naive Bayes classifier on each leaf node of the built decision tree. NBTree significantly outperforms C4.5 and NB in terms of classification accuracy. However, it incurs very high time complexity. In this paper, we propose...

Publication date: 1996